This section covers the Collaborator database schema and some special features of the database we created specifically to support external custom reporting applications.
The Database is Read-Only!
Never change data in the database directly. Although we support read-only access to the database for reporting and automation, if you make changes to data in the database yourself you could irreparably destroy the integrity of the database.
Database Schema
The diagram below shows only the major tables in the Collaborator database and which files should be linked with which other fields when creating queries.
There are more tables which are either (a) purposefully undocumented or (b) described below but which do not have relationships to other tables. The diagram does not contain those tables in the interest of simplicity:
|
Here is a brief description of each table:
Table |
Description |
---|---|
|
Activity log of users' actions during review. This is used to compute metrics such as inspection rate. Each row represents a slice of time where the user was active. There might be many slices for a given user in a single review. Each slice includes a duration (in seconds) and a "start time" that is encoded both as a database date/time ( The activity action code ( It almost always best to use the reporting views to access activity data so you do not get your query wrong. The technique and motivation behind this system is described in our metrics section. |
|
List of user <--> review combinations. For each user who was "assigned" to a review, notes the role associated with that user. The You should not depend on this field as we might change it in the future. Refer to the |
|
Every time a set of files is uploaded to the server a "changelist" record is created. Most changelists will be associated with a review through If the changelist is of files from a local hard drive, the "local GUID" field will contain a globally-unique made-up identifier for that upload. If the changelist was derived from something already checked into version control (for example, a Perforce or Subversion changelist), then this field will be blank. The date, author, and check-in comment are all listed if known. To the extent possible this will match usernames with version control. Also linked is the SCM table ID. This ties the changelist to a specific version control server. Changelists from different servers might match in other details but are actually unrelated. If the changelist happened to have an associated identifier in a version control system, that is also recorded. Otherwise that field is blank. |
|
Represents a comment made by a user in some conversation. This includes not only actual chat but indirect events such as "marked read" and "created defect" and "new file uploaded". Also included is the file ( |
|
Represents a defect made by a user in some conversation. Also included is the file ( A state field indicates whether the defect is still open or fixed. We will be adding more state to this field in the future. |
|
Holds basic change metrics for file versions. Each You should only depend on the values of this table for versions directly associated with changelists. The other metrics are often incomplete. There are technical reasons for this; we will not be changing this behavior. |
|
Represents a group. Some of the groups are defined in the web UI and some are built-in internal groups automatically defined by Collaborator |
|
Joins the |
|
Joins the |
|
The behavior of this table is intentionally undocumented. |
|
Joins reviews and changelists. A changelist can be associated with zero, one, or many reviews, and a review can be associated with zero, one, or many changelists. |
|
All of these tables have to do with "meta-data" which means any data where the data schema itself is dynamic. Most notably, all review and defect custom fields are a kind of meta-data. You should not use the meta-data tables directly. Their relationships are very complex and we change how they work regularly as we add more features. Instead, access meta-data through the reporting views described below. This contains all the information you need for custom fields and formats it nicely as an added bonus. |
|
Information for all the drop-down items in any custom field. Each item is matched to a particular custom field. The "title" is the text displayed to end users. A "sequence" number defines the order of the elements (the IDs are not an order). Items can also be individually enabled or disabled. |
|
Holds the history of notification messages that have been sent out to clients. Clients might choose (or not) to get notifications by email, RSS feed, and so on. This table is periodically cleaned out by the server. There will always be some backlog of events for each user (for example, for use in creating the RSS feed for a user), but you cannot depend on any particular number of events to be saved. |
|
Internal server use. Do not depend on this table. |
|
Holds one record for each review in the system. The "creator" is the user who created the review, or the system administrator if the review was created automatically. Use the review custom field view to access review custom field data. |
|
Internal server use. Do not depend on this table. |
|
Represents all of the roles from all role-sets. Each role has a "standard" name that never changes and the custom name that was set by the user. |
|
Contains one record for each SCM server that has ever been reported by a client. It is OK if there are duplicate records for a given SCM system. This separates changelists from different systems. This table will probably change in the future. |
|
One record for each user who can log into the system. User ID 1 is the special system administrator. Key user information and preferences are stored here. Additional user preference information is stored as meta-data and is accessible from the special The Passwords are stored in hashed form so that a casual observer cannot deduce a password. If you need to reset a password, set this field to:
|
|
One record for every version of every file that has ever been uploaded to the server. Join with Each version includes the full file path to the original document. If this file was retrieved from version control, this will be the version control server path, not the path on the user's local hard drive. The version name is the version control-specific name of the version of the file. This is typically a number or set of numbers. The version change-type indicates whether this represents a file addition, deletion, modification, and so on. Sometimes the system does not know. You should use this as a guide but not depend on it because there are exceptions to the "type" rules and we add new types periodically. The current values are:
Sometimes the version will have a "previous" version. This typically means the version that came before it in version control. This is used internally and is tricky; there are lots of exceptions and we can change exactly what this means. The content MD5 is the MD5 sum of the raw content of the file. This can be used to link a version with the on-disk file content stored in the content cache. It can also be used as a check to see whether two versions are identical. |
Reporting Views
For databases that support the concept of a "View," Collaborator creates a set of Views specifically for the purpose of external report-writers. You should use these Views whenever possible; we will make sure that the definitions of these Views remain the same even if we change the database schema in future versions.
You can also use these Views as a guide for how to create other custom queries.
Here are the special reporting Views:
View |
Description |
---|---|
|
Contains columns from the assignment table which will be maintained in the event of a future schema change. |
|
Contains names and descriptions for the codes used in the |
|
One row per review reporting file path, lines of code reviewed, and number of defects. |
|
These are the custom fields you have defined for defects, one row for each defect. If you change the custom field definition, the layout of this table will change as well (automatically, and immediately). Warning: Because the exact custom field titles are used for column names, if you change the title of a custom field it will change the definition of this view. With review workflows, each review might have a different subset of custom fields. In this case fields are |
|
Same as |
|
This table contains a list of the possible values as defined in drop-down and multi-select Defect Custom Fields. |
|
Contains "defect state" codes and names. Join on this table if you would like to display more user-friendly defect state names. |
|
Contains columns from the defect table which will be maintained in the event of a future schema change. |
|
Shows Participant Custom Field values that users have selected in reviews. This table only shows values for "Single-Line Text" type custom fields. A NULL value means that a custom field was defined for a review's template, but the user did not specify a value. |
|
Shows Participant Custom Field values that users have selected in reviews. This table only shows values for "Multi-Line Text" type custom fields. A NULL value means that a custom field was defined for a review's template, but the user did not specify a value. |
|
Shows Participant Custom Field values that users have selected in reviews. This table shows values for "Drop-Down" and "Multi-Select" type custom fields. A NULL value means that a custom field was defined for a review's template, but the user did not specify a value. In the case of Multi-Select fields, if a user selected multiple values then multiple rows will appear in the results, one for each selection. |
|
Represents the various phases a review can be in. |
|
For each unique combination of review, user, and role, reports the person-hours spent. This includes data for current review participants only. If a user was a participant but is not now, that person will not be included in this result. However that time will be included in the |
|
One row per review reporting author, reviewer, rework hours, and total person-hours spent in the review.
|
|
One row per review reporting author, reviewer, and total number of comments made in the review. |
|
One row per review reporting the number of defects created in that review. |
|
One row per review reporting a variety of standard metrics such as inspection rate, defect rate, defect density, and the individual numbers used to form those ratios. Some values will contain |
|
Lists all versions actually associated with a review, although with the associated review. |
|
One row per review reporting the number of files and line metrics (total, added, modified, removed) for all files in the review. |
|
These are the custom fields you have defined for reviews, one row for each review. If you change the custom field definition, the layout of this table will change as well (automatically, and immediately). With review workflows, each review might have a different subset of custom fields. In this case fields are |
|
Same as |
|
This table contains a list of the possible values as defined in drop-down and multi-select Review Custom Fields. |
|
Contains columns from the review table which will be maintained in the event of a future schema change. |
|
Contains columns from the role table which will be maintained in the event of a future schema change. |
|
Additional user preferences, one row per user. Most user information and preferences are stored in the |
|
Same as |
|
Contains columns from the user table which will be maintained in the event of a future schema change. |